 membership probability


Learning Multi-Order Block Structure in Higher-Order Networks

arXiv.org Artificial Intelligence

Higher-order networks, naturally described as hypergraphs, are essential for modeling real-world systems involving interactions among three or more entities. Stochastic block models offer a principled framework for characterizing mesoscale organization, yet their extension to hypergraphs involves a trade-off between expressive power and computational complexity. A recent simplification, a single-order model, mitigates this complexity by assuming a single affinity pattern governs interactions of all orders. This universal assumption, however, may overlook order-dependent structural details. Here, we propose a framework that relaxes this assumption by introducing a multi-order block structure, in which different affinity patterns govern distinct subsets of interaction orders. Our framework is based on a multi-order stochastic block model and searches for the optimal partition of the set of interaction orders that maximizes out-of-sample hyperlink prediction performance. Analyzing a diverse range of real-world networks, we find that multi-order block structures are prevalent. Accounting for them not only yields better predictive performance over the single-order model but also uncovers sharper, more interpretable mesoscale organization. Our findings reveal that order-dependent mechanisms are a key feature of the mesoscale organization of real-world higher-order networks.


ChronoFlow: A Data-Driven Model for Gyrochronology

arXiv.org Artificial Intelligence

Gyrochronology is a technique for constraining stellar ages using rotation periods, which change over a star's main sequence lifetime due to magnetic braking. This technique shows promise for main sequence FGKM stars, where other methods are imprecise. However, models have historically struggled to capture the observed rotational dispersion in stellar populations. To properly understand this complexity, we have assembled the largest standardized data catalog of rotators in open clusters to date, consisting of ~7,400 stars across 30 open clusters/associations spanning ages of 1.5 Myr to 4 Gyr. We have also developed ChronoFlow: a flexible data-driven model which accurately captures observed rotational dispersion. We show that ChronoFlow can be used to accurately forward model rotational evolution, and to infer both cluster and individual stellar ages. We recover cluster ages with a statistical uncertainty of 0.06 dex ($\approx$ 15%), and individual stellar ages with a statistical uncertainty of 0.7 dex. Additionally, we conducted robust systematic tests to analyze the impact of extinction models, cluster membership, and calibration ages on our model's performance. These contribute an additional $\approx$ 0.06 dex of uncertainty in cluster age estimates, resulting in a total error budget of 0.08 dex. We estimate ages for the NGC 6709 open cluster and the Theia 456 stellar stream, and calculate revised rotational ages for M34, NGC 2516, NGC 1750, and NGC 1647. Our results show that ChronoFlow can precisely estimate the ages of coeval stellar populations, and constrain ages for individual stars. Furthermore, its predictions may be used to inform physical spin down models. ChronoFlow will be publicly available at https://github.com/philvanlane/chronoflow.
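As a rough check on the figures quoted in the ChronoFlow abstract, the sketch below converts an uncertainty in dex to a fractional uncertainty and combines two error terms. It assumes the statistical and systematic terms are independent and add in quadrature, which the abstract does not state; the function names are illustrative and not part of ChronoFlow.

```python
import math


def dex_to_fractional(dex):
    """Fractional (upward) change corresponding to an offset in dex (log10 units)."""
    return 10.0 ** dex - 1.0


def combine_in_quadrature(*errors):
    """Root-sum-square of independent error terms (an assumption, see lead-in)."""
    return math.sqrt(sum(e * e for e in errors))


statistical_dex = 0.06  # cluster-age statistical uncertainty quoted in the abstract
systematic_dex = 0.06   # additional systematic uncertainty quoted in the abstract

fraction = dex_to_fractional(statistical_dex)
total_dex = combine_in_quadrature(statistical_dex, systematic_dex)

print(f"0.06 dex is about {100 * fraction:.0f}% in age")
print(f"combined uncertainty is about {total_dex:.2f} dex")
```

Under these assumptions the numbers agree with the abstract's "0.06 dex (about 15%)" and its total error budget of 0.08 dex.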


Improving behavior profile discovery for vehicles

arXiv.org Artificial Intelligence

Multiple approaches have already been proposed to mimic real driver behaviors in simulation. This article proposes a new one, based solely on undisturbed observation of intersections, from which the behavior profiles for each macro-maneuver are discovered. Using the macro-maneuvers already identified in previous works, a method for comparing trajectories of different lengths using an Extended Kalman Filter (EKF) is proposed which, combined with an Expectation-Maximization (EM) inspired method, defines the clusters that represent the observed behaviors. This is paired with a Kullback-Leibler (KL) divergence criterion to decide when clusters need to be split or merged. Finally, the behaviors for each macro-maneuver are determined by the clusters discovered, without using any map information about the environment, while remaining dynamically consistent with vehicle motion. The observations indicate that the two main factors in drivers' behavior are their assertiveness and their interaction with other road users.


What Drives Online Popularity: Author, Content or Sharers? Estimating Spread Dynamics with Bayesian Mixture Hawkes

arXiv.org Artificial Intelligence

The spread of content on social media is shaped by intertwining factors on three levels: the source, the content itself, and the pathways of content spread. At the lowest level, the popularity of the sharing user determines its eventual reach. However, higher-level factors such as the nature of the online item and the credibility of its source also play crucial roles in determining how widely and rapidly the online item spreads. In this work, we propose the Bayesian Mixture Hawkes (BMH) model to jointly learn the influence of source, content and spread. We formulate the BMH model as a hierarchical mixture model of separable Hawkes processes, accommodating different classes of Hawkes dynamics and the influence of feature sets on these classes. We test the BMH model on two learning tasks, cold-start popularity prediction and temporal profile generalization performance, applying it to two real-world retweet cascade datasets referencing articles from controversial and traditional media publishers. The BMH model outperforms the state-of-the-art models and predictive baselines on both datasets and utilizes cascade- and item-level information better than the alternatives. Lastly, we perform a counterfactual analysis in which we apply the trained publisher-level BMH models to a set of article headlines and show that the effectiveness of headline writing style (neutral, clickbait, inflammatory) varies across publishers. The BMH model unveils differences in style effectiveness between controversial and reputable publishers: we find clickbait to be notably more effective for reputable publishers than for controversial ones, which links to the latter's overuse of clickbait.


A multi-modal representation of El Niño Southern Oscillation Diversity

arXiv.org Artificial Intelligence

The El Niño-Southern Oscillation (ENSO), characterized by anomalous sea surface temperature (SST) in the tropical Pacific, exhibits notable diversity in its temporal evolution and spatial distribution of anomalies. The El Niño events of 1982-83 and 1997-98, for instance, recorded exceptionally high sea surface temperature anomaly (SSTA) values in the eastern equatorial Pacific, whereas the El Niño of 2002-03 was notably less extreme and primarily restricted to the central equatorial Pacific (McPhaden, 2004). Despite each being categorized as an El Niño, the 2002-03 event exhibited global climate conditions distinct from those of the earlier two events. In order to describe these event-to-event differences, El Niño events have been categorized as Eastern Pacific (EP) and Central Pacific (CP) types (Capotondi et al., 2020). EP El Niño events typically have their peak SSTA in the Eastern Pacific, exhibit stronger intensities, and show a largely reduced zonal thermocline slope, resulting in the discharge of warm water from the equatorial thermocline. In contrast, CP events show peak SSTA in the Central Pacific and are comparatively weaker, with more limited changes in zonal thermocline slope and reduced warm water discharge (Kug, Jin, and An, 2009; Capotondi, 2013). Despite considerable research, the underlying causes of ENSO diversity remain elusive (Lee and McPhaden, 2010; Capotondi et al., 2015; Capotondi et al., 2020). And while some general circulation models (GCMs) do exhibit ENSO event-to-event differences, their representation of ENSO diversity appears to be model dependent and often differs from observations in intensity, pattern and duration (Cai et al., 2018). The different types of ENSO events have substantially different downstream impacts on the global climate and dynamics (Strnad et al., 2022).


Multi-study R-learner for Heterogeneous Treatment Effect Estimation

arXiv.org Machine Learning

Estimating heterogeneous treatment effects is crucial for informing personalized treatment strategies and policies. While multiple studies can improve the accuracy and generalizability of results, leveraging them for estimation is statistically challenging. Existing approaches often assume identical heterogeneous treatment effects across studies, but this may be violated due to various sources of between-study heterogeneity, including differences in study design, confounders, and sample characteristics. To this end, we propose a unifying framework for multi-study heterogeneous treatment effect estimation that is robust to between-study heterogeneity in the nuisance functions and treatment effects. Our approach, the multi-study R-learner, extends the R-learner to obtain principled statistical estimation with modern machine learning (ML) in the multi-study setting. The multi-study R-learner is easy to implement and flexible in its ability to incorporate ML for estimating heterogeneous treatment effects, nuisance functions, and membership probabilities, which borrow strength across heterogeneous studies. It achieves robustness in confounding adjustment through its loss function and can leverage both randomized controlled trials and observational studies. We provide asymptotic guarantees for the proposed method in the case of series estimation and illustrate using real cancer data that it has the lowest estimation error compared to existing approaches in the presence of between-study heterogeneity.
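To make the role of the loss function concrete, here is a minimal sketch of the standard single-study R-learner step, simplified to a constant treatment effect. It is not the multi-study method of the paper: the nuisance estimates `m_hat` (outcome mean) and `e_hat` (propensity score) are assumed to be given, and the toy data and function name are hypothetical.

```python
def r_learner_constant_effect(y, w, m_hat, e_hat):
    """Minimise the R-loss sum_i [(y_i - m_i) - (w_i - e_i) * tau]^2 over a constant tau.

    For a constant tau the minimiser has a closed form: a least-squares
    regression of outcome residuals on treatment residuals.
    """
    y_res = [yi - mi for yi, mi in zip(y, m_hat)]  # outcome residuals
    w_res = [wi - ei for wi, ei in zip(w, e_hat)]  # treatment residuals
    numerator = sum(yr * wr for yr, wr in zip(y_res, w_res))
    denominator = sum(wr * wr for wr in w_res)
    return numerator / denominator


# Hypothetical noiseless toy data with a true effect of 2.0.
w = [0, 1, 0, 1, 1, 0]                  # treatment indicators
e_hat = [0.5] * 6                       # assumed propensity estimates
m_hat = [1.0, 2.0, 0.5, 1.5, 3.0, 2.5]  # assumed outcome-mean estimates
true_tau = 2.0
y = [mi + (wi - ei) * true_tau for mi, wi, ei in zip(m_hat, w, e_hat)]

tau_hat = r_learner_constant_effect(y, w, m_hat, e_hat)
print(tau_hat)
```

In practice the nuisance functions would be fitted with cross-fitting and the effect would be a flexible function of the covariates; the closed form above only illustrates the residual-on-residual structure of the loss.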


An astronomer's introduction to NumPyro

#artificialintelligence

Over the past year or so, I've been using JAX extensively for my research, and I've also been encouraging other astronomers to give it a try. In particular, I've been using JAX as the computation engine for probabilistic inference tasks. There's more to it, but one way that I like to think about JAX is as NumPy with just-in-time compilation and automatic differentiation. The just-in-time compilation features of JAX can be used to speed up your NumPy computations by removing some Python overhead and by executing them on your GPU. Then, automatic differentiation can be used to efficiently compute the derivatives of your code with respect to its input parameters.
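A minimal sketch of those two features, assuming `jax` is installed; the function `sum_of_squares` is a made-up example and not taken from the original post.

```python
import jax
import jax.numpy as jnp


def sum_of_squares(x):
    # Written exactly as it would be with NumPy, but using jax.numpy.
    return jnp.sum(x ** 2)


# Just-in-time compilation: the first call traces and compiles the function.
fast_sum_of_squares = jax.jit(sum_of_squares)

# Automatic differentiation: gradient with respect to the first argument.
grad_sum_of_squares = jax.grad(sum_of_squares)

x = jnp.array([1.0, 2.0, 3.0])
value = fast_sum_of_squares(x)
gradient = grad_sum_of_squares(x)

print(value)     # the sum of squares of x
print(gradient)  # the derivative 2 * x, element by element
```

The two transformations compose, so `jax.jit(jax.grad(sum_of_squares))` gives a compiled gradient function.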


Extracting Conceptual Knowledge from Natural Language Text Using Maximum Likelihood Principle

arXiv.org Artificial Intelligence

Domain-specific knowledge graphs constructed from natural language text are ubiquitous in today's world. In many such scenarios the base text, from which the knowledge graph is constructed, concerns itself with practical, on-hand, actual or ground-reality information about the domain. Product documentation in the software engineering domain is one example of such base texts. Other examples include blogs and texts related to digital artifacts, reports on emerging markets and business models, patient medical records, etc. Though the above sources contain a wealth of knowledge about their respective domains, the conceptual knowledge on which they are based is often missing or unclear. Access to this conceptual knowledge can enormously increase the utility of available data and assist in several tasks such as knowledge graph completion, grounding, querying, etc. Our contributions in this paper are twofold. First, we propose a novel Markovian stochastic model for document generation from conceptual knowledge. The uniqueness of our approach lies in the fact that the conceptual knowledge in the writer's mind forms a component of the parameter set of our stochastic model. Second, we solve the inverse problem of learning the best conceptual knowledge from a given document, by finding the model parameters that maximize the likelihood of generating the specific document over all possible parameter values. This likelihood maximization is done using an application of the Baum-Welch algorithm, which is a known special case of the Expectation-Maximization (EM) algorithm. We run our conceptualization algorithm on several well-known natural language sources and obtain very encouraging results. The results of our extensive experiments concur with the hypothesis that the information contained in these sources has a well-defined and rigorous underlying conceptual structure, which can be discovered using our method.


Learning sparse relational transition models

arXiv.org Artificial Intelligence

We present a representation for describing transition models in complex uncertain domains using relational rules. For any action, a rule selects a set of relevant objects and computes a distribution over properties of just those objects in the resulting state given their properties in the previous state. An iterative greedy algorithm is used to construct a set of deictic references that determine which objects are relevant in any given state. Feed-forward neural networks are used to learn the transition distribution on the relevant objects' properties. This strategy is demonstrated to be both more versatile and more sample efficient than learning a monolithic transition model in a simulated domain in which a robot pushes stacks of objects on a cluttered table. Many complex domains are appropriately described in terms of sets of objects, properties of those objects, and relations among them. We are interested in the problem of taking actions to change the state of such complex systems, in order to achieve some objective. To do this, we require a transition model, which describes the system state that results from taking a particular action, given the previous system state.


Classification and clustering for samples of event time data using non-homogeneous Poisson process models

arXiv.org Machine Learning

Duncan S Barrack (a) and Simon Preston (b). (a) Horizon Digital Economy Research Institute, University of Nottingham, Nottingham, UK. (b) School of Mathematical Sciences, University of Nottingham, Nottingham, UK. Abstract: Data in the form of event times arise in various applications. A simple model for such data is a non-homogeneous Poisson process (NHPP), which is specified by a rate function that depends on time. We consider the problem of having access to multiple independent samples of event time data, observed on a common interval, from which we wish to classify or cluster the samples according to their rate functions. Each rate function is unknown but assumed to belong to a finite set of rate functions, each defining a distinct class. We model the rate functions using a spline basis expansion, the coefficients of which need to be estimated from data. The classification approach consists of using training data, for which the class membership is known, to calculate maximum likelihood estimates of the coefficients for each group, then assigning test samples to a class by a maximum likelihood criterion. For clustering, by analogy to the Gaussian mixture model approach for Euclidean data, we consider a mixture of NHPP models and use the expectation-maximisation algorithm to estimate the coefficients of the rate functions for the component models and cluster membership probabilities for each sample. The classification and clustering approaches perform well on both synthetic and real-world data sets.
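The maximum likelihood assignment and the membership probabilities can be illustrated with a small sketch. It uses the standard NHPP log-likelihood on an interval [0, T], the sum of log rates at the event times minus the integrated rate. As a simplifying assumption, two hypothetical linear rate functions with known integrals stand in for the fitted spline expansions, and the mixing weights are taken to be equal.

```python
import math


def nhpp_log_likelihood(event_times, rate, integrated_rate, T):
    """log L = sum_i log(rate(t_i)) - integral of rate over [0, T]."""
    return sum(math.log(rate(t)) for t in event_times) - integrated_rate(T)


# Hypothetical classes: (rate function, its integral from 0 to T).
CLASSES = {
    "increasing": (lambda t: 1.0 + 20.0 * t, lambda T: T + 10.0 * T ** 2),
    "decreasing": (lambda t: 21.0 - 20.0 * t, lambda T: 21.0 * T - 10.0 * T ** 2),
}


def class_log_likelihoods(event_times, T=1.0):
    return {
        name: nhpp_log_likelihood(event_times, rate, integral, T)
        for name, (rate, integral) in CLASSES.items()
    }


def classify(event_times, T=1.0):
    """Assign the sample to the class with the largest log-likelihood."""
    scores = class_log_likelihoods(event_times, T)
    return max(scores, key=scores.get)


def membership_probabilities(event_times, T=1.0):
    """Posterior class probabilities under equal mixing weights (as in an E-step)."""
    scores = class_log_likelihoods(event_times, T)
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


late_events = [0.6, 0.7, 0.8, 0.9, 0.95]
early_events = [0.05, 0.1, 0.2]
print(classify(late_events), classify(early_events))
print(membership_probabilities(late_events))
```

A full clustering procedure would alternate this probability computation with re-estimating the rate-function coefficients and mixing weights, which the sketch leaves out.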